19 research outputs found

Street Name Data as a Reflection of Migration and Settlement History

Author: Sarah J. Berkemer
Peter F. Stadler
Publication venue: 'MDPI AG'
Publication date: 20/04/2023

Street names (odonyms) play an important role not only as descriptors of geographic locations but also due to their sociological and political connotations and commemorative character. Here we analyse street names in Europe and North America extracted from OpenStreetMap, asking in particular to what extent odonyms reflect early European settlements in the New World, i.e., the immigration of German, Austrian and Scandinavian minorities. We observe that old street names of European origin can predominantly be found in rural areas. North American street names indeed recapitulate local and regional settlement histories. The aim of this study is to demonstrate that easily accessible data sets from freely available map data such as street names convey usable information concerning migration patterns and the history of settlements in the case of European immigrants in North America as well as colonial history. We provide a freely available pipeline to analyse this kind of data.

Qucosa - Publikationsserver der Universität Leipzig

Automated Design of Dynamic Programming Schemes for RNA Folding with Pseudoknots

Author: Sarah J. Berkemer
Laurent Bulteau
Bertrand Marchand
Yann Ponty
Sebastian Will
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022)
Publication date: 01/01/2022

Despite being a textbook application of dynamic programming (DP) and a routine task in RNA structure analysis, RNA secondary structure prediction remains challenging whenever pseudoknots come into play. To circumvent the NP-hardness of energy minimization in realistic energy models, specialized algorithms have been proposed for restricted conformation classes that capture the most frequently observed configurations. While these methods rely on hand-crafted DP schemes, we generalize and fully automate the design of DP pseudoknot prediction algorithms. We formalize the problem of designing DP algorithms for an (infinite) class of conformations, modeled by (a finite number of) fatgraphs, and automatically build DP schemes minimizing their algorithmic complexity. We propose an algorithm for the problem, based on the tree-decomposition of a well-chosen representative structure, which we simplify and reinterpret as a DP scheme. The algorithm is fixed-parameter tractable for the tree-width tw of the fatgraph, and its output represents an O(n^{tw+1}) algorithm for predicting the MFE folding of an RNA of length n. Our general framework supports general energy models, partition function computations, recursive substructures and partial folding, and could pave the way for algebraic dynamic programming beyond the context-free case.
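For orientation, the "textbook application" of DP that the abstract starts from is pseudoknot-free folding. The sketch below is not the paper's method: it is the classic Nussinov-style base-pair maximization, which runs in O(n^3) time and only considers nested pairs. The function name `nussinov`, the `min_loop` parameter, and the simple pairing rules (Watson-Crick plus G-U wobble) are illustrative assumptions; realistic tools minimize free energy rather than count pairs.

```python
def nussinov(seq, min_loop=3):
    """Maximum number of nested (pseudoknot-free) base pairs in an RNA sequence.

    Illustrative assumption: a pair (k, j) needs at least `min_loop`
    unpaired bases between its two ends.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = maximum number of pairs within seq[i..j] (inclusive);
    # entries with j < i stay 0 and stand for empty intervals.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: position j is unpaired
            # case 2: j pairs with some k, splitting [i, j] into
            # a left part [i, k-1] and an enclosed part [k+1, j-1]
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]
```

Because every pair splits the interval into two independent sub-intervals, crossing (pseudoknotted) pairs cannot be represented in this recursion, which is the limitation the abstract addresses.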

Dagstuhl Research Online Publication Server

Orthologs, turn-over, and remolding of tRNAs in primates and fruit flies

Author: Anne Hoffmann
Clara I. Bermúdez-Santana
Cristian A. Velandia-Huerto
Liliana C. Romero Marroquín
Maribel Hernández-Rosales
Nancy Retzlaff
Peter F. Stadler
Sarah J. Berkemer
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Street Name Data as a Reflection of Migration and Settlement History

Author: Sarah J. Berkemer
Peter F. Stadler
Publication venue: 'MDPI AG'
Publication date: 20/04/2023

Qucosa

Street Name Data as a Reflection of Migration and Settlement History

Author: Sarah J. Berkemer
Peter F. Stadler
Publication venue: 'MDPI AG'
Publication date: 20/04/2023

HSSS - Hochschulschriftenserver der SLUB

Street Name Data as a Reflection of Migration and Settlement History

Author: Peter F. Stadler
Sarah J. Berkemer
Publication venue: 'MDPI AG'
Publication date: 11/12/2020

Multidisciplinary Digital Publishing Institute

Algebraic Dynamic Programming on Trees

Author: Christian Höner zu Siederdissen
Peter F. Stadler
Sarah J. Berkemer
Publication venue: 'MDPI AG'
Publication date: 01/01/2017

Where string grammars describe how to generate and parse strings, tree grammars describe how to generate and parse trees. We show how to extend generalized algebraic dynamic programming to tree grammars. The resulting dynamic programming algorithms are efficient and provide the complete feature set available to string grammars, including automatic generation of outside parsers and algebra products for efficient backtracking. The complete parsing infrastructure is available as an embedded domain-specific language in Haskell. In addition to the formal framework, we provide implementations for both tree alignment and tree editing. Both algorithms are in active use in, among others, the area of bioinformatics, where optimization problems on trees are of considerable practical importance. This framework and the accompanying algorithms provide a beneficial starting point for developing complex grammars with tree- and forest-based inputs.

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

Fraunhofer-ePrints

Infrared: a declarative tree decomposition-powered framework for bioinformatics

Author: Sarah J. Berkemer
Bertrand Marchand
Yann Ponty
Sebastian Will
Hua-Ting Yao
Publication venue: HAL CCSD
Publication date: 19/09/2023

Motivation: Many bioinformatics problems can be approached as optimization or controlled sampling tasks, and solved exactly and efficiently using Dynamic Programming (DP). However, such exact methods are typically tailored towards specific settings, complex to develop, and hard to implement and adapt to problem variations.
Methods: We introduce the Infrared framework to overcome such hindrances for a large class of problems. Its underlying paradigm is tailored toward problems that can be declaratively formalized as sparse feature networks, a generalization of constraint networks. Classic Boolean constraints specify a search space, consisting of putative solutions whose evaluation is performed through a combination of features. Problems are then solved using generic cluster tree elimination algorithms over a tree decomposition of the feature network. Their overall complexities are linear in the number of variables, and exponential only in the treewidth of the feature network. For sparse feature networks, associated with low to moderate treewidths, these algorithms make it possible to find optimal solutions, or generate controlled samples, with practical empirical efficiency.
Results: Implementing these methods, the Infrared software allows Python programmers to rapidly develop exact optimization and sampling applications based on efficient tree decomposition-based processing. Instead of directly coding specialized algorithms, problems are declaratively modeled as sets of variables over finite domains, whose dependencies are captured by constraints and functions. Such models are then automatically solved by generic DP algorithms. To illustrate the applicability of Infrared in bioinformatics and guide new users, we model and discuss variants of bioinformatics applications. We provide reimplementations (and extensions) for methods targeting RNA design, RNA sequence-structure alignment, parsimony-driven inference of ancestral traits in phylogenetic trees/networks, and coding sequence design, and we demonstrate multidimensional Boltzmann sampling. Previous work together with novel results demonstrate the practical relevance of the framework, whose complexity is typically equivalent to or better than that of specialized algorithms and implementations.
Availability: Infrared is available at https://www.lix.polytechnique.fr/~will/Software/Infrared with extensive documentation, including various usage examples and API reference; it can be installed using Conda or from source.
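The elimination idea described in the abstract can be illustrated with a minimal sketch. This is not Infrared's API or implementation: the function `eliminate`, its argument layout, and the use of `-inf` scores to encode violated constraints are all assumptions made for illustration. It performs plain max-sum variable (bucket) elimination, where each intermediate table ranges only over the neighbours of the eliminated variable, so table sizes are exponential in the width of the chosen elimination order rather than in the total number of variables.

```python
from itertools import product


def eliminate(domains, features, order):
    """Maximize the sum of all features by eliminating variables one by one.

    domains:  dict mapping each variable to its list of values
    features: list of (scope, fn); fn takes the values of the scope variables
              in scope order and returns a score (float('-inf') = forbidden)
    order:    elimination order covering every variable
    """
    # Tabulate every feature as (scope, {value tuple: score}).
    factors = []
    for scope, fn in features:
        scope = tuple(scope)
        table = {vals: fn(*vals)
                 for vals in product(*(domains[v] for v in scope))}
        factors.append((scope, table))

    for var in order:
        # Collect all factors mentioning var; the rest are left untouched.
        bucket = [f for f in factors if var in f[0]]
        factors = [f for f in factors if var not in f[0]]
        new_scope = tuple(sorted({v for s, _ in bucket for v in s} - {var}))
        new_table = {}
        for vals in product(*(domains[v] for v in new_scope)):
            assign = dict(zip(new_scope, vals))
            best = float("-inf")
            for x in domains[var]:
                assign[var] = x
                total = sum(t[tuple(assign[v] for v in s)] for s, t in bucket)
                best = max(best, total)
            new_table[vals] = best  # best score after maximizing out var
        factors.append((new_scope, new_table))

    # All remaining factors have an empty scope; their sum is the optimum.
    return sum(t[()] for _, t in factors)
```

For a chain of variables in which each feature links neighbouring positions, every intermediate table has at most one variable in its scope, which mirrors the abstract's statement that the cost is linear in the number of variables and exponential only in the treewidth.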

HAL-Ecole des Ponts ParisTech

SMORE: Synteny Modulator of Repetitive Elements

Author: Anne Hoffmann
Cameron R. A. Murray
Peter F. Stadler
Sarah J. Berkemer
Publication venue: 'MDPI AG'
Publication date: 01/10/2017

Several families of multicopy genes, such as transfer ribonucleic acids (tRNAs) and ribosomal RNAs (rRNAs), are subject to concerted evolution, an effect that keeps sequences of paralogous genes effectively identical. Under these circumstances, it is impossible to distinguish orthologs from paralogs on the basis of sequence similarity alone. Synteny, the preservation of relative genomic locations, however, also remains informative for the disambiguation of evolutionary relationships in this situation. In this contribution, we describe an automatic pipeline for the evolutionary analysis of such cases that uses genome-wide alignments as a starting point to assign orthology relationships determined by synteny. The evolution of tRNAs in primates as well as the history of the Y RNA family in vertebrates and nematodes are used to showcase the method. The pipeline is freely available.

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals